Abstract. Church's Higher Order Logic is a basis for influential proof assistants, HOL
and PVS. Church's logic has a simple set-theoretic semantics, making it trustworthy and
extensible. We factor HOL into a constructive core plus axioms of excluded middle and
choice. We similarly factor standard set theory, ZFC, into a constructive core, IZF, and
axioms of excluded middle and choice. Then we provide the standard set-theoretic
semantics in such a way that the constructive core of HOL is mapped into IZF. We use the
disjunction, numerical existence and term existence properties of IZF to provide a program
extraction capability from proofs in the constructive core.

We can implement the disjunction and numerical existence properties in two different 
ways: one using Rathjen's realizability for IZF and the other using a new direct weak 
normalization result for IZF by Moczydlowski. The latter can also be used for the term 
existence property. 



Church's Higher-Order logic [Chu40, Lei94] has been remarkably successful at capturing
the intuitive reasoning of mathematicians. It was distilled from Principia Mathematica, and
is sometimes called the Simple Theory of Types based on that legacy. It incorporates the
$\lambda$-calculus as its notation for functions, including propositional functions, thus interfacing
well with computer science, where the $\lambda$-calculus is fundamental.

One of the reasons Higher-Order logic is successful is that its axiomatic basis is very 
small, and it has a clean set-theoretic semantics at a low level of the cumulative hierarchy
of sets (up to $\omega + \omega$) and can thus be formalized in a small fragment of ZFC set theory. This
means it interfaces well with standard mathematics and provides a strong basis for trust. 
Moreover, the set theory semantics is the basis for many extensions of the core logic; for 
example, it is straightforward to add arrays, recursive data types, and records to the logic. 

Church's theory is the logical basis of two of the most successful interactive provers
used in hardware and software verification, HOL [GM93] and PVS [ORS92]. This is due in
part to the two characteristics mentioned above in addition to its elegant automation based
on Milner's tactic mechanism and its elegant formulation in the ML metalanguage.

Until recently, one of the few drawbacks of HOL was that its logical base did not 
allow a way to express a constructive subset of the logic. This issue was considered by 
Harrison for HOL-light [Har96], and recently Berghofer implemented a constructive version
of HOL in the Isabelle implementation [Ber04, BN02] in large part to enable the extraction
of programs from constructive proofs. This raises the question of finding a semantics for 
HOL that justifies this intuitively sound extraction. 

The standard justification for program extraction is based on logics that embed
extraction deeply into their semantics; this is the case for the Calculus of Inductive
Constructions (CIC) [CPM90, BC04], Minlog [BBS+98], Computational Type Theory (CTT)
[ABC+06, C+86] or the closely related Intuitionistic Type Theory (ITT) [ML82, NPS90].
The mechanism of extraction is built deeply into the logic and the provers based on it, e.g.
Agda [ACN90] on ITT, Coq [The04] on CIC, MetaPRL [HNC+03] and Nuprl [ACE+00] on
CTT.

In this paper we show that there is a way to provide a clean set-theoretic semantics 
for HOL and at the same time use it to semantically justify program extraction. The idea 
is to first factor HOL into its constructive core, say Constructive HOL, plus the axioms of 
excluded middle and choice. The semantics for this language can be given in ZFC set theory, 
and if that logic is factored into its constructive core, called IZF, plus excluded middle and 
choice (choice is sufficient to give excluded middle), then in the standard semantics, IZF 
provides the semantics for Constructive HOL. Moreover, we can base program extraction 
on the IZF semantics. 

The constructive content of IZF is not as transparent as in the constructive set theory
CZF of Aczel [Acz78], as he is able to interpret CZF in Type Theory, while no such
interpretation is known for IZF. However, it is not possible to express the impredicative nature of
Higher-Order Logic in CZF. Also, IZF is not as expressive as Howe's ZFC [How96, How98]
with inaccessible cardinals and computational primitives, but this makes IZF a more
standard theory.

Our semantics is appealing not only because it factors so elegantly, but also because the 
computational issues and program extraction can be reduced to the standard constructive 
properties of IZF — the disjunction, numerical existence and term existence properties. 

We can implement the disjunction and numerical existence properties in two different
ways: one using Rathjen's realizability for CZF [Rat05], recently extended to IZF
[Rat06], and the other using a new direct weak normalization result for IZF by Moczydlowski
[Moc06a, Moc06b]. The latter can also be used for the term existence property.

In this paper, we provide a set-theoretic semantics for HOL which has the following 
properties: 

• It is as simple as the standard semantics presented by Gordon and Melham [GM93].

• It works in constructive set-theory. 

• It provides a semantical basis for program extraction. 

• It can be applied to the constructive version of HOL recently implemented in
Isabelle-HOL as a means of using constructive HOL proofs as programs.

This paper is organized as follows. In section 2 we present a version of HOL. In section
3 we define set-theoretic semantics. Section 4 defines constructive set theory IZF and states
its main properties. We show how these properties can be used for program extraction in
section 5.



2. Higher-order logic 

In this section, we present higher-order logic in detail. There are two syntactic
categories: terms and types. The types are generated by the following abstract grammar:

\[ \tau ::= \mathit{nat} \mid \mathit{bool} \mid \mathit{prop} \mid \tau \to \tau \mid \tau \times \tau \]

The distinction between bool and prop corresponds to the distinction between the two- 
element type and the type of propositions in type theory, or between the two-element 
object and the subobject classifier in category theory or, as we shall see, between 2 and the 
set of all subsets of 1 in constructive set theory. 

The terms of HOL are generated by the following abstract grammar: 

\[ t ::= x_\tau \mid c_\tau \mid (t_{\tau \to \sigma}\ u_\tau)_\sigma \mid (\lambda x_\tau.\ t_\sigma)_{\tau \to \sigma} \mid (t_\tau, s_\sigma)_{\tau \times \sigma} \]

Thus each term $t_\alpha$ in HOL is annotated with a type $\alpha$, which we call the type of $t$. We
will often omit the type annotations on terms; this practice should not lead to confusion,
as the implicit type system is very simple. Terms of type prop are called formulas.

The free variables of a term $t$ are denoted by $FV(t)$ and defined as usual. We consider
$\alpha$-equivalent terms equal. The notation $t[x := u]$ stands for a capture-avoiding substitution
and denotes the result of substituting $u$ for $x$ in the term $t$.
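
The two grammars and the definition of free variables can be mirrored directly as data types. The following is only a minimal sketch under our own assumptions: the class names (`Base`, `Arrow`, `Prod`, `Var`, `Const`, `App`, `Lam`, `Pair`) and the helpers `type_of` and `free_vars` are hypothetical and are not taken from HOL or any prover.

```python
from dataclasses import dataclass

# HOL types: nat | bool | prop | tau -> tau | tau x tau (names are illustrative).
@dataclass(frozen=True)
class Base:
    name: str          # "nat", "bool" or "prop"

@dataclass(frozen=True)
class Arrow:
    dom: object
    cod: object

@dataclass(frozen=True)
class Prod:
    left: object
    right: object

# HOL terms: variables and constants carry their type annotation.
@dataclass(frozen=True)
class Var:
    name: str
    ty: object

@dataclass(frozen=True)
class Const:
    name: str
    ty: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

@dataclass(frozen=True)
class Lam:
    var: Var
    body: object

@dataclass(frozen=True)
class Pair:
    fst: object
    snd: object

def type_of(t):
    """Recover the type annotation of a term, rejecting ill-typed applications."""
    if isinstance(t, (Var, Const)):
        return t.ty
    if isinstance(t, App):
        f = type_of(t.fun)
        if not (isinstance(f, Arrow) and f.dom == type_of(t.arg)):
            raise TypeError("ill-typed application")
        return f.cod
    if isinstance(t, Lam):
        return Arrow(t.var.ty, type_of(t.body))
    if isinstance(t, Pair):
        return Prod(type_of(t.fst), type_of(t.snd))
    raise TypeError("not a term")

def free_vars(t):
    """FV(t), defined by structural recursion on the term."""
    if isinstance(t, Var):
        return {t}
    if isinstance(t, Const):
        return set()
    if isinstance(t, App):
        return free_vars(t.fun) | free_vars(t.arg)
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.var}
    if isinstance(t, Pair):
        return free_vars(t.fst) | free_vars(t.snd)
    raise TypeError("not a term")
```

For instance, with `x = Var("x", Base("nat"))` and a constant `S` of type nat → nat, `type_of(Lam(x, App(S, x)))` is the arrow type nat → nat and the abstraction has no free variables.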

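Our version of HOL has a set of built-in constants. To increase readability, we write
$c : \tau$ instead of $c_\tau$ to provide the information about the type of $c$. If the type of a constant
involves $\alpha$, it is a constant schema: there is one constant for each type $\tau$ substituted for $\alpha$.
There are thus constants $=_{\mathit{bool}}$, $=_{\mathit{nat}}$ and so on.

\[
\begin{array}{lll}
\bot : \mathit{prop} & \top : \mathit{prop} & =_\alpha\ : \alpha \times \alpha \to \mathit{prop} \\
\to\ : \mathit{prop} \times \mathit{prop} \to \mathit{prop} & \land : \mathit{prop} \times \mathit{prop} \to \mathit{prop} & \lor : \mathit{prop} \times \mathit{prop} \to \mathit{prop} \\
\forall_\alpha : (\alpha \to \mathit{prop}) \to \mathit{prop} & \exists_\alpha : (\alpha \to \mathit{prop}) \to \mathit{prop} & \varepsilon_\alpha : (\alpha \to \mathit{prop}) \to \alpha \\
0 : \mathit{nat} \qquad S : \mathit{nat} \to \mathit{nat} & \mathit{false} : \mathit{bool} & \mathit{true} : \mathit{bool}
\end{array}
\]
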
We present the proof rules for HOL in a sequent-based natural deduction style. A
sequent is a pair $(\Gamma, t)$, where $\Gamma$ is a list of formulas and $t$ is a formula. The free variables of
a context are the free variables of all its formulas. A sequent $(\Gamma, t)$ is written as $\Gamma \vdash t$. We
write binary constants (equality, implication, etc.) using infix notation. We use standard
abbreviations for quantifiers: $\forall a : \tau.\ \phi$ abbreviates $\forall_\tau(\lambda a_\tau.\ \phi)$, and similarly for $\exists a : \tau.\ \phi$. The
proof rules for HOL are as follows:



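\[
\frac{t \in \Gamma}{\Gamma \vdash t} \qquad
\frac{}{\Gamma \vdash t = t} \qquad
\frac{\Gamma \vdash t = s}{\Gamma \vdash \lambda x_\tau.\ t = \lambda x_\tau.\ s}
\]

\[
\frac{\Gamma \vdash t \quad \Gamma \vdash s}{\Gamma \vdash t \land s} \qquad
\frac{\Gamma \vdash t \land s}{\Gamma \vdash t} \qquad
\frac{\Gamma \vdash t \land s}{\Gamma \vdash s} \qquad
\frac{}{\Gamma \vdash \top}
\]

\[
\frac{\Gamma \vdash t}{\Gamma \vdash t \lor s} \qquad
\frac{\Gamma \vdash s}{\Gamma \vdash t \lor s} \qquad
\frac{\Gamma \vdash t \lor s \quad \Gamma, t \vdash u \quad \Gamma, s \vdash u}{\Gamma \vdash u}
\]

\[
\frac{\Gamma, t \vdash s}{\Gamma \vdash t \to s} \qquad
\frac{\Gamma \vdash t \to s \quad \Gamma \vdash t}{\Gamma \vdash s} \qquad
\frac{\Gamma \vdash s = u \quad \Gamma \vdash t[x := u]}{\Gamma \vdash t[x := s]}
\]

\[
\frac{\Gamma \vdash f_{\alpha \to \mathit{prop}}\ t_\alpha}{\Gamma \vdash \exists_\alpha(f_{\alpha \to \mathit{prop}})} \qquad
\frac{\Gamma \vdash \exists_\alpha(f_{\alpha \to \mathit{prop}}) \quad \Gamma, f_{\alpha \to \mathit{prop}}\ x_\alpha \vdash u}{\Gamma \vdash u}\ x_\alpha \text{ new}
\]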



Finally, we list HOL axioms. 

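(1) (FALSE) $\bot = \forall b : \mathit{prop}.\ b$.

(2) (FALSENOTTRUE) $\mathit{false} = \mathit{true} \to \bot$.

(3) (BETA) $(\lambda x_\tau.\ t_\sigma)\, s_\tau = t_\sigma[x_\tau := s_\tau]$.

(4) (ETA) $(\lambda x_\tau.\ f_{\tau \to \sigma}\, x_\tau) = f_{\tau \to \sigma}$, where $x \notin FV(f)$.

(5) (FORALL) $\forall_\alpha = \lambda P_{\alpha \to \mathit{prop}}.\ (P = \lambda x_\alpha.\ \top)$.

(6) (P3) $\forall n : \mathit{nat}.\ (0 = S(n)) \to \bot$.

(7) (P4) $\forall n, m : \mathit{nat}.\ S(n) = S(m) \to n = m$.

(8) (P5) $\forall P : \mathit{nat} \to \mathit{prop}.\ P(0) \land (\forall n : \mathit{nat}.\ P(n) \to P(S(n))) \to \forall n : \mathit{nat}.\ P(n)$.

(9) (BOOL) $\forall x : \mathit{bool}.\ (x = \mathit{false}) \lor (x = \mathit{true})$.

(10) (EM) $\forall x : \mathit{prop}.\ (x = \bot) \lor (x = \top)$.

(11) (CHOICE) $\forall P : \alpha \to \mathit{prop}.\ \forall x : \alpha.\ P\, x \to P(\varepsilon_{(\alpha \to \mathit{prop}) \to \alpha}(P))$.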

Our choice of rules and axioms is redundant. Propositional connectives, for example, 
could be defined in terms of quantifiers and bool. However, we believe that this makes 
the account of the semantics clearer and shows how easy it is to define a sound semantics
for such a system. Our presentation is based on the core part of the theory of [GM93]. It
does not include type definitions and parametric polymorphism. We believe extending it to 
incorporate these features should not be very difficult. 

The theory CHOL (Constructive HOL) arises by taking away from HOL the axioms 
(CHOICE) and (EM). 

We write $\vdash_H \phi$ and $\vdash_C \phi$ to denote that HOL and CHOL, respectively, prove $\phi$. We
will generally use letters $\mathcal{P}$, $\mathcal{Q}$ to denote proof trees. The notation $\mathcal{P} \vdash_C \phi$ means that $\mathcal{P}$ is a
proof tree in CHOL of $\phi$.

3. Semantics 

3.1. Set theory. The set-theoretic semantics needs a small part of the cumulative hierarchy:
$R_{\omega+\omega}$ is sufficient to carry out all the constructions. The Axiom of Choice is necessary in
order to define the meaning of the $\varepsilon$ constant. For this purpose, $C$ will denote a¹ necessarily
non-constructive function such that for any $X, Y \in R_{\omega+\omega}$:

• If $X$ is non-empty, then $C(X, Y) \in X$.

• If $X$ is empty and $Y$ is non-empty, then $C(X, Y) \in Y$.

• Otherwise, $C(X, Y)$ is $\emptyset$.

Recall that in the world of set theory, $0 = \emptyset$, $1 = \{0\}$ and $2 = \{0, 1\}$. Classically $P(1)$,
the set of all subsets of 1, is equal to 2. This is not the case constructively; there is no
uniform way of transforming an arbitrary subset of 1 into an element of 2. In fact, it is easy
to see that $P(1) = 2$ entails the law of excluded middle:

Lemma 3.1. If $P(1) = 2$, then for any $\phi$, $\phi$ or $\neg\phi$.

Proof. Suppose $P(1) = 2$ and take a formula $\phi$. Consider $A = \{x \in 1 \mid \phi\}$ and
$B = \{x \in 1 \mid \neg\phi\}$. Since $A \cup B \in P(1)$, $A \cup B \in 2$, so either $A \cup B = 0$ or $A \cup B = 1$. In the former
case, $0 \notin A$ and $0 \notin B$. Then we have $\neg\phi$ because from $\phi$ we obtain $0 \in A$, which is a
contradiction. But we also have $\neg\neg\phi$ because from $\neg\phi$ we obtain $0 \in B$, which is also a
contradiction. Thus we have refuted the assumption $A \cup B = 0$, so $A \cup B = 1$. Therefore
$0 \in A \cup B$, so either $0 \in A$, in which case $\phi$, or $0 \in B$, in which case $\neg\phi$. So either $\phi$ or $\neg\phi$. □

¹ Note that if we want to pinpoint $C$, we need to assume more than AC, as the existence of a definable
choice function for $R_{\omega+\omega}$ is not provable in ZFC.

The following helpful lemma, however, does hold in a constructive world: 

Lemma 3.2. If $A \in P(1)$, then $A = 1$ iff $0 \in A$.

Let us also define precisely the function application operation in set theory. We borrow
the definition from [Acz99].

\[ App(f, x) = \{z \mid \exists y.\ z \in y \land (x, y) \in f\} \]

The advantage of using this definition over an intuitive one ("the unique $y$ such that
$(x, y) \in f$") is that it is defined for all sets $f$ and $x$. Partiality of $App$ would entail serious problems
in the constructive setting. This definition is equivalent to the standard one when $f$ is a
function:

Lemma 3.3. If $f$ is a function from $A$ to $B$ and $x \in A$, then $App(f, x)$ is the unique $y$ such
that $(x, y) \in f$.

Proof. Let $y$ be the unique element of $B$ such that $(x, y) \in f$. If $z \in App(f, x)$ then there
is $y'$ such that $z \in y'$ and $(x, y') \in f$. Since $y' = y$, $z \in y$. For the other direction, if $z \in y$,
then obviously $z \in App(f, x)$. □

From now on, the notation $f(x)$ means $App(f, x)$. We will also use a lambda notation
in set theory to define functions: $\lambda x \in A.\ B(x)$ means $\{(x, B(x)) \mid x \in A\}$.
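
The totality of $App$ can be observed on hereditarily finite sets. The sketch below is an illustration under our own assumptions: sets are modelled as Python `frozenset` values, ordered pairs as Python tuples rather than as sets, and the helper names `app`, `numeral` and `lam` are hypothetical.

```python
def app(f, x):
    """App(f, x) = {z | exists y. z in y and (x, y) in f}; defined for all f and x."""
    result = set()
    for element in f:
        # Only elements of f that are pairs with first component x contribute.
        if isinstance(element, tuple) and len(element) == 2 and element[0] == x:
            result |= element[1]
    return frozenset(result)

def numeral(n):
    """Von Neumann numeral: 0 is the empty set and n + 1 = n U {n}."""
    s = frozenset()
    for _ in range(n):
        s = s | {s}
    return s

def lam(domain, body):
    """The graph {(x, body(x)) | x in domain} of a set-theoretic function."""
    return frozenset((x, body(x)) for x in domain)
```

On the graph of a function, `app` returns the unique value, as in Lemma 3.3; outside the domain, or on a set that is not a function at all, it still returns a set (the empty one) instead of failing.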

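3.2. The definition of the semantics. We first define a meaning $[\![\tau]\!]$ of a type $\tau$ by
structural induction on $\tau$.

• $[\![\mathit{nat}]\!] = \mathbb{N}$.

• $[\![\mathit{bool}]\!] = 2$.

• $[\![\mathit{prop}]\!] = P(1)$.

• $[\![\tau \times \sigma]\!] = [\![\tau]\!] \times [\![\sigma]\!]$, where $A \times B$ denotes the cartesian product of sets $A$ and $B$.

• $[\![\tau_1 \to \tau_2]\!] = [\![\tau_1]\!] \to [\![\tau_2]\!]$, where $A \to B$ denotes the set of all functions from $A$ to $B$.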

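The meaning of a constant $c_\alpha$ is denoted by $[\![c_\alpha]\!]$ and is defined as follows.

• $[\![=_\alpha]\!] = \lambda(x_1, x_2) \in [\![\alpha]\!] \times [\![\alpha]\!].\ \{x \in 1 \mid x_1 = x_2\}$.

• $[\![\to]\!] = \lambda(b_1, b_2) \in [\![\mathit{prop}]\!] \times [\![\mathit{prop}]\!].\ \{x \in 1 \mid x \in b_1 \to x \in b_2\}$.

• $[\![\lor]\!] = \lambda(b_1, b_2) \in [\![\mathit{prop}]\!] \times [\![\mathit{prop}]\!].\ b_1 \cup b_2$.

• $[\![\land]\!] = \lambda(b_1, b_2) \in [\![\mathit{prop}]\!] \times [\![\mathit{prop}]\!].\ b_1 \cap b_2$.

• $[\![\mathit{false}]\!] = [\![\bot]\!] = \emptyset$.

• $[\![\mathit{true}]\!] = [\![\top]\!] = 1$.

• $[\![\forall_\alpha]\!] = \lambda f \in [\![\alpha]\!] \to [\![\mathit{prop}]\!].\ \bigcap_{a \in [\![\alpha]\!]} f(a)$.

• $[\![\exists_\alpha]\!] = \lambda f \in [\![\alpha]\!] \to [\![\mathit{prop}]\!].\ \bigcup_{a \in [\![\alpha]\!]} f(a)$.

• $[\![\varepsilon_\alpha]\!] = \lambda P \in [\![\alpha]\!] \to [\![\mathit{prop}]\!].\ C(P^{-1}(\{1\}), [\![\alpha]\!])$.

• $[\![0]\!] = 0$.

• $[\![S]\!] = \lambda n \in \mathbb{N}.\ n + 1$.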

Standard semantics, presented for example by Gordon and Melham in [GM93], uses a
truth table approach: implication $\phi \to \psi$ is false iff $\phi$ is true and $\psi$ is false, etc. It is easy
to see that with excluded middle, our semantics is equivalent to the standard one.



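Lemma 3.4 (ZF). For any $A, B \in P(1)$, $[\![\to]\!](A, B) = 0$ iff $A = 1$ and $B = 0$.

Proof. Suppose $[\![\to]\!](A, B) = 0$. Then $\{x \in 1 \mid x \in A \to x \in B\} = 0$, so
$0 \notin \{x \in 1 \mid x \in A \to x \in B\}$, so it is not the case that $0 \in A \to 0 \in B$, so $0 \in A$ and $0 \notin B$. Thus, $A = 1$
and $B = 0$. The other direction is easy. □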
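
The classical reading of these clauses can be checked mechanically on the two classical truth values. The sketch below is an illustration only and rests on a loudly stated assumption: it takes $P(1) = 2$, which is exactly the classical case of Lemma 3.4, since a program can only enumerate the two decidable subsets of 1. The names `sep`, `sem_eq`, `sem_imp`, `sem_or`, `sem_and`, `sem_forall` and `sem_exists` are hypothetical.

```python
ZERO = frozenset()            # 0 = the empty set
ONE = frozenset({ZERO})       # 1 = {0}
PROP = [ZERO, ONE]            # ASSUMPTION: classically, P(1) = 2 = {0, 1}

def sep(condition):
    """{x in 1 | condition(x)}: the separation used in the clauses for = and ->."""
    return frozenset(x for x in ONE if condition(x))

def sem_eq(a, b):
    return sep(lambda x: a == b)

def sem_imp(b1, b2):
    # Classical reading of "x in b1 -> x in b2".
    return sep(lambda x: (x not in b1) or (x in b2))

def sem_or(b1, b2):
    return b1 | b2

def sem_and(b1, b2):
    return b1 & b2

def sem_forall(domain, f):
    """Intersection of f(a) over a (non-empty) domain, computed inside 1."""
    out = ONE
    for a in domain:
        out = out & f(a)
    return out

def sem_exists(domain, f):
    """Union of f(a) over the domain."""
    out = ZERO
    for a in domain:
        out = out | f(a)
    return out
```

Looping over `PROP` confirms the truth-table behaviour of Lemma 3.4, and evaluating the bodies of (FALSE) and (EM) over `PROP` gives `ZERO` and `ONE` respectively under the stated classical assumption.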

The definition of our semantics is not original. The meaning of logical constants is
essentially a combination of the fact that any complete lattice with pseudo-complements is
a model for higher-order logic and that $P(1)$ is a complete lattice with pseudo-complement
defined in the clause for $\to$ [RS63]. Similar semantics for HOL have also been provided in
a category-theoretical setting [LS86]. The novelty of our approach lies in utilizing this kind
of semantics for the purpose of program extraction in Section 5.

To present the rest of the semantics, we need to introduce environments. An
environment is a function $\rho$ from HOL variables to sets such that $\rho(x_\tau) \in [\![\tau]\!]$. We will use the
symbol $\rho$ exclusively for environments. The meaning $[\![t]\!]_\rho$ of a term $t$ is parameterized by
an environment $\rho$ and defined by structural induction on $t$:

• $[\![c_\tau]\!]_\rho = [\![c_\tau]\!]$.

• $[\![x_\tau]\!]_\rho = \rho(x_\tau)$.

• $[\![s\ u]\!]_\rho = App([\![s]\!]_\rho, [\![u]\!]_\rho)$.

• $[\![\lambda x_\tau.\ u]\!]_\rho = \{(a, [\![u]\!]_{\rho[x_\tau := a]}) \mid a \in [\![\tau]\!]\}$.

• $[\![(s, u)]\!]_\rho = ([\![s]\!]_\rho, [\![u]\!]_\rho)$.
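
The five clauses translate into a small recursive evaluator. The following is a sketch under our own assumptions, not the construction used in the paper: terms are encoded as tagged tuples, each abstraction carries a finite domain standing in for the meaning of its type, functions are sets of (argument, value) pairs, and `apply_graph` is a partial lookup, unlike the total $App$. The names `evaluate` and `apply_graph` are hypothetical.

```python
def apply_graph(f, x):
    """Look up x in a function given as a set of (argument, value) pairs."""
    for a, b in f:
        if a == x:
            return b
    raise KeyError(x)

def evaluate(term, env):
    """Compute the meaning of a term under env by structural induction."""
    tag = term[0]
    if tag == "const":        # [[c]]_rho = [[c]]
        return term[1]
    if tag == "var":          # [[x]]_rho = rho(x)
        return env[term[1]]
    if tag == "app":          # [[s u]]_rho = App([[s]]_rho, [[u]]_rho)
        return apply_graph(evaluate(term[1], env), evaluate(term[2], env))
    if tag == "lam":          # {(a, [[u]]_rho[x:=a]) | a in the given domain}
        _, x, domain, body = term
        return frozenset((a, evaluate(body, {**env, x: a})) for a in domain)
    if tag == "pair":         # [[(s, u)]]_rho = ([[s]]_rho, [[u]]_rho)
        return (evaluate(term[1], env), evaluate(term[2], env))
    raise ValueError(tag)
```

Evaluating an abstraction never consults the environment at its bound variable, so a closed abstraction has the same meaning under every environment, in the spirit of Lemma 3.7.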

3.3. The properties of the semantics. There are several standard properties of the 
semantics we have defined. 

Lemma 3.5 (Substitution Lemma). For any terms $t$, $s$ and environment $\rho$,
$[\![t]\!]_{\rho[x := [\![s]\!]_\rho]} = [\![t[x := s]]\!]_\rho$.

Proof. By structural induction on $t$. Case $t$ of:

• $c$. The claim is obvious.

• $x$. Then $[\![x]\!]_{\rho[x := [\![s]\!]_\rho]} = [\![s]\!]_\rho = [\![x[x := s]]\!]_\rho$.

• $u\ v$. Then $[\![u\ v]\!]_{\rho[x := [\![s]\!]_\rho]} = App([\![u]\!]_{\rho[x := [\![s]\!]_\rho]}, [\![v]\!]_{\rho[x := [\![s]\!]_\rho]})$. By the inductive hypothesis,
this is equal to $App([\![u[x := s]]\!]_\rho, [\![v[x := s]]\!]_\rho) = [\![u[x := s]\ v[x := s]]\!]_\rho = [\![t[x := s]]\!]_\rho$.

• $(u, v)$. Similar to the previous case.

• $\lambda y_\tau.\ u$. Without loss of generality we may assume that $y \notin \{x\} \cup FV(s)$. Then
$[\![t]\!]_{\rho[x := [\![s]\!]_\rho]} = \{(a, [\![u]\!]_{\rho[x := [\![s]\!]_\rho][y := a]}) \mid a \in [\![\tau]\!]\}$. By the inductive hypothesis, this is equal to
$\{(a, [\![u[x := s]]\!]_{\rho[y := a]}) \mid a \in [\![\tau]\!]\} = [\![\lambda y_\tau.\ u[x := s]]\!]_\rho = [\![t[x := s]]\!]_\rho$. □

Lemma 3.6. For any type $\alpha$, $\exists x.\ x \in [\![\alpha]\!]$.

Proof. Easy. □

Lemma 3.7. If $x_\alpha \notin FV(t)$, then for any $b \in [\![\alpha]\!]$, $[\![t]\!]_\rho = [\![t]\!]_{\rho[x_\alpha := b]}$.

Proof. Straightforward induction on $t$. We only show the case when $t = \lambda y_\tau.\ u$. Without
loss of generality we can assume that $y \neq x$. We have $[\![t]\!]_\rho = \{(a, [\![u]\!]_{\rho[y := a]}) \mid a \in [\![\tau]\!]\}$.
Since $x \notin FV(u)$, by the inductive hypothesis this is equal to $\{(a, [\![u]\!]_{\rho[y := a][x := b]}) \mid a \in [\![\tau]\!]\}$.
Since $x \neq y$, this is also equal to $\{(a, [\![u]\!]_{\rho[x := b][y := a]}) \mid a \in [\![\tau]\!]\} = [\![\lambda y_\tau.\ u]\!]_{\rho[x := b]}$. □



Lemma 3.8. For any $\rho$, $[\![t_\alpha]\!]_\rho \in [\![\alpha]\!]$.

Proof. By induction on $t$. Case $t$ of:

• $x_\tau$. The claim follows by the definition of environments.

• $c_\tau$. We proceed by case analysis of $c$. We show the interesting cases.

  - $\forall_\alpha$. The type of $c$ is $(\alpha \to \mathit{prop}) \to \mathit{prop}$. We need to show that if $f$ is a function from
    $[\![\alpha]\!]$ to $P(1)$, then $\bigcap_{a \in [\![\alpha]\!]} f(a)$ is in $P(1)$. Since for any $a \in [\![\alpha]\!]$, $f(a) \in P(1)$ and $P(1)$
    is closed under intersections, the claim follows.

  - $\exists_\alpha$. The proof is similar and follows by the fact that $P(1)$ is closed under unions.

  - $\varepsilon_\alpha$. The type of $\varepsilon_\alpha$ is $(\alpha \to \mathit{prop}) \to \alpha$. Take any function $P$ from $[\![\alpha]\!]$ to $P(1)$. Then
    $P^{-1}(\{1\}) \subseteq [\![\alpha]\!]$. By the definition of $C$, if $P^{-1}(\{1\}) \neq \emptyset$, then $[\![\varepsilon_\alpha]\!](P) \in [\![\alpha]\!]$. So
    suppose $P^{-1}(\{1\}) = \emptyset$. By Lemma 3.6, $[\![\alpha]\!]$ is not empty, so by the definition of $C$,
    $[\![\varepsilon_\alpha]\!](P) \in [\![\alpha]\!]$ as well.

In particular, this implies that for any formula $t$, $[\![t]\!]_\rho \subseteq 1$. So if we want to prove that
$[\![t]\!]_\rho = 1$, then by Lemma 3.2 it suffices to show that $0 \in [\![t]\!]_\rho$.



3.4. Soundness. The soundness theorem establishes validity of the proof rules and axioms 
with respect to the semantics. 

Definition 3.9. We write $[\![\Gamma]\!]_\rho = 1$ if $[\![t_1]\!]_\rho = 1, \ldots, [\![t_n]\!]_\rho = 1$, where $\Gamma = t_1, t_2, \ldots, t_n$.

Theorem 3.10 (Soundness). If $\Gamma \vdash t$ then for any $\rho$, if $[\![\Gamma]\!]_\rho = 1$, then $[\![t]\!]_\rho = 1$.

Proof. Straightforward induction on $\Gamma \vdash t$. We show several interesting cases.



The claim is trivial. 

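\[ \frac{\Gamma \vdash t = s}{\Gamma \vdash \lambda x_\tau.\ t = \lambda x_\tau.\ s} \]

We need to show that $\{(a, [\![t]\!]_{\rho[x_\tau := a]}) \mid a \in [\![\tau]\!]\} = \{(a, [\![s]\!]_{\rho[x_\tau := a]}) \mid a \in [\![\tau]\!]\}$. That is,
that for any $a \in [\![\tau]\!]$, $[\![t]\!]_{\rho[x_\tau := a]} = [\![s]\!]_{\rho[x_\tau := a]}$. Let $\rho' = \rho[x_\tau := a]$. We get the claim by
the inductive hypothesis.

\[ \frac{\Gamma, t \vdash s}{\Gamma \vdash t \to s} \]

Suppose $[\![\Gamma]\!]_\rho = 1$. We need to show that $0 \in \{x \in 1 \mid x \in [\![t]\!]_\rho \to x \in [\![s]\!]_\rho\}$. Since
$0 \in 1$, assume $0 \in [\![t]\!]_\rho$. Then $[\![\Gamma, t]\!]_\rho = 1$. By the inductive hypothesis $[\![s]\!]_\rho = 1$, thus also
$0 \in [\![s]\!]_\rho$.

\[ \frac{\Gamma \vdash t \to s \quad \Gamma \vdash t}{\Gamma \vdash s} \]

Suppose $[\![\Gamma]\!]_\rho = 1$. By the inductive hypothesis, $0 \in \{x \in 1 \mid x \in [\![t]\!]_\rho \to x \in [\![s]\!]_\rho\}$ and
$0 \in [\![t]\!]_\rho$, so easily $0 \in [\![s]\!]_\rho$.

\[ \frac{\Gamma \vdash s = u \quad \Gamma \vdash t[x := u]}{\Gamma \vdash t[x := s]} \]

Assume $[\![\Gamma]\!]_\rho = 1$. By the inductive hypothesis, $[\![s]\!]_\rho = [\![u]\!]_\rho$ and $[\![t[x := u]]\!]_\rho = 1$. Using
the Substitution Lemma we get $[\![t[x := u]]\!]_\rho = [\![t]\!]_{\rho[x := [\![u]\!]_\rho]} = [\![t]\!]_{\rho[x := [\![s]\!]_\rho]} = [\![t[x := s]]\!]_\rho$.

\[ \frac{\Gamma \vdash f\ t_\alpha}{\Gamma \vdash \exists_\alpha(f_{\alpha \to \mathit{prop}})} \]

Assume $[\![\Gamma]\!]_\rho = 1$. We have to show that $0 \in \bigcup_{a \in [\![\alpha]\!]} [\![f]\!]_\rho(a)$, so that there is $a \in [\![\alpha]\!]$
such that $0 \in [\![f]\!]_\rho(a)$. By Lemma 3.8, $[\![t_\alpha]\!]_\rho \in [\![\alpha]\!]$, so taking $a = [\![t_\alpha]\!]_\rho$ we get the claim
by the inductive hypothesis.

\[ \frac{\Gamma \vdash \exists_\alpha(f_{\alpha \to \mathit{prop}}) \quad \Gamma, f\ x_\alpha \vdash u}{\Gamma \vdash u}\ x_\alpha \text{ new} \]

Suppose $[\![\Gamma]\!]_\rho = 1$. By the inductive hypothesis, there is $a \in [\![\alpha]\!]$ such that $0 \in [\![f]\!]_\rho(a)$.
Let $\rho' = \rho[x_\alpha := a]$. By the inductive hypothesis we get $0 \in [\![u]\!]_{\rho'}$. As $x_\alpha \notin FV(u)$, by
Lemma 3.7, $[\![u]\!]_\rho = 1$. □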

Having verified the soundness of the HOL proof rules, we proceed to verify the soundness 
of the axioms. 

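Theorem 3.11. For any axiom $t$ of HOL and any $\rho$ defined on $FV(t)$, $0 \in [\![t]\!]_\rho$.

Proof. We proceed axiom by axiom and sketch the respective proofs.

• (FALSE) $[\![\bot]\!]_\rho = 0 = \bigcap_{a \in P(1)} a = [\![\forall b : \mathit{prop}.\ b]\!]_\rho$. The second equality follows by
$0 \in P(1)$.

• (BETA) We have $[\![(\lambda x_\tau.\ t_\sigma)\, s_\tau]\!]_\rho = App([\![\lambda x_\tau.\ t_\sigma]\!]_\rho, [\![s_\tau]\!]_\rho) = App(\{(a, [\![t]\!]_{\rho[x := a]}) \mid a \in [\![\tau]\!]\}, [\![s_\tau]\!]_\rho)
= [\![t]\!]_{\rho[x_\tau := [\![s_\tau]\!]_\rho]}$ = (by the Substitution Lemma) = $[\![t_\sigma[x_\tau := s_\tau]]\!]_\rho$.

• (ETA) $[\![\lambda x_\tau.\ f_{\tau \to \sigma}\, x_\tau]\!]_\rho = \{(a, [\![f\ x_\tau]\!]_{\rho[x_\tau := a]}) \mid a \in [\![\tau]\!]\} = \{(a, App([\![f]\!]_{\rho[x_\tau := a]}, a)) \mid a \in [\![\tau]\!]\}$
= (since $x_\tau \notin FV(f)$) = $\{(a, [\![f]\!]_\rho(a)) \mid a \in [\![\tau]\!]\} = [\![f]\!]_\rho$, as by Lemma 3.8 $[\![f]\!]_\rho \in
[\![\tau]\!] \to [\![\sigma]\!]$ and functions in set theory are represented by their graphs.

• (FORALL) We have:

\[ [\![\forall_\alpha]\!]_\rho = \{(P, \bigcap_{a \in [\![\alpha]\!]} P(a)) \mid P \in [\![\alpha]\!] \to P(1)\} \]

Furthermore:

\[ [\![\lambda P_{\alpha \to \mathit{prop}}.\ P = \lambda x_\alpha.\ \top]\!]_\rho = \{(P, \{z \in 1 \mid P = \lambda x \in [\![\alpha]\!].\ 1\}) \mid P \in [\![\alpha]\!] \to P(1)\} \]

So take any $P \in [\![\alpha]\!] \to P(1)$. It suffices to show that $\bigcap_{a \in [\![\alpha]\!]} P(a) = \{z \in 1 \mid P =
\lambda x \in [\![\alpha]\!].\ 1\}$. We have $x \in \bigcap_{a \in [\![\alpha]\!]} P(a)$ iff for all $a \in [\![\alpha]\!]$, $x \in P(a)$ and $x = 0$. This
happens if and only if $x = 0$ and for all $a \in [\![\alpha]\!]$, $P(a) = 1$, which is equivalent to
$x \in \{z \in 1 \mid P = \lambda x \in [\![\alpha]\!].\ 1\}$. The claim follows.

• The axioms P3, P4, P5 follow by the fact that natural numbers satisfy the respective
Peano axioms.

• (BOOL) We need to show that $[\![\forall_{\mathit{bool}}(\lambda x_{\mathit{bool}}.\ x = \mathit{false} \lor x = \mathit{true})]\!]_\rho = 1$. Unwinding
the definition, this is equivalent to $\bigcap_{x \in 2} (\{z \in 1 \mid x = 0\} \cup \{z \in 1 \mid x = 1\}) = 1$,
and furthermore to: for all $x \in 2$ and $y$, $y \in \{z \in 1 \mid x = 0\} \cup \{z \in 1 \mid x = 1\}$ iff
$y = 0$. Take any $x \in 2$ and $y$. The left-to-right direction is obvious; for the right-to-left
direction, either $x = 0$ or $x = 1$. In the former case, $0 \in \{z \in 1 \mid x = 0\}$, in the latter
$0 \in \{z \in 1 \mid x = 1\}$.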

• (EM) We need to show that $[\![\forall_{\mathit{prop}}(\lambda x_{\mathit{prop}}.\ x = \bot \lor x = \top)]\!]_\rho = 1$. Reasoning as in
the case of (BOOL), we find that this is equivalent to: for all $x \in P(1)$ and $y$,
$y \in \{z \in 1 \mid x = 0\} \cup \{z \in 1 \mid x = 1\}$ iff $y = 0$. Suppose $x \in P(1)$. At this point, it is impossible
to proceed further constructively; all we know is that $x$ is a subset of 1, which does not
provide enough information to decide whether $x = 0$ or $x = 1$. However, classically, using
the rule of excluded middle, $P(1) = 2$ and we proceed as in the previous case.

• (CHOICE) We argue classically, so in particular $P(1) = 2$. We need to show that

\[ [\![\forall_{\alpha \to \mathit{prop}}(\lambda P_{\alpha \to \mathit{prop}}.\ \forall_\alpha(\lambda x_\alpha.\ P\,x \to P(\varepsilon_{(\alpha \to \mathit{prop}) \to \alpha}(P))))]\!] = 1, \]

which is equivalent to

\[ \bigcap_{P \in [\![\alpha]\!] \to 2} [\![\forall_\alpha(\lambda x_\alpha.\ P\,x \to P(\varepsilon_{(\alpha \to \mathit{prop}) \to \alpha}(P)))]\!] = 1, \]

which is equivalent to

\[ \bigcap_{P \in [\![\alpha]\!] \to 2}\ \bigcap_{x \in [\![\alpha]\!]} [\![P\,x \to P(\varepsilon_{(\alpha \to \mathit{prop}) \to \alpha}(P))]\!] = 1, \]

which is equivalent to

\[ \bigcap_{P \in [\![\alpha]\!] \to 2}\ \bigcap_{x \in [\![\alpha]\!]} \{a \in 1 \mid a \in P(x) \to a \in P(C(P^{-1}(\{1\}), [\![\alpha]\!]))\} = 1. \]

To show this, it suffices to show that for all $P \in [\![\alpha]\!] \to 2$, for all $x \in [\![\alpha]\!]$, if $0 \in P(x)$
then $0 \in P(C(P^{-1}(\{1\}), [\![\alpha]\!]))$. Take any $P$ and $x$. Suppose $0 \in P(x)$. Then $P(x) = 1$,
so $x \in P^{-1}(\{1\})$. Therefore $C(P^{-1}(\{1\}), [\![\alpha]\!]) \in P^{-1}(\{1\})$, so $P(C(P^{-1}(\{1\}), [\![\alpha]\!])) = 1$,
which shows the claim. □

• Extensionality Two sets are equal if they have the same elements.

• Empty Set There is an empty set.

• Pairing For any sets $a$, $b$, there is a set consisting of $a$ and $b$.

• Infinity There is a set closed under the successor operation and containing the empty
set.

• Union For any set $a$, there is a set $\bigcup a$ which is a union of all elements of $a$.

• Power Set For any set $a$, there is a set of all subsets of $a$.

• Separation For any formula $\phi$, for any set $a$, there is a set of all elements of $a$ satisfying
$\phi$.

• Replacement For any formula $\phi(x, y, z)$, for any set $a$, if for all $x \in a$ there is exactly
one $y$ such that $\phi(x, y, z)$ holds, then there is a set $b$ such that for all $x \in a$ there is $y \in b$
such that $\phi(x, y, z)$ holds.

• $\in$-Induction For any formula $\phi(a, z)$, if for all sets $b$ $(\forall x \in b.\ \phi(x, z))$ implies $\phi(b, z)$, then
for all $a$, $\phi(a, z)$ holds.

Figure 1: The axioms of IZF with Replacement

Corollary 3.12. HOL is consistent: it is not the case that $\vdash_H \bot$.

Proof. Otherwise we would have $[\![\bot]\!] = [\![\top]\!]$, that is $0 = 1$. □



4. IZF 



The essential advantage of the semantics in the previous section over a standard one 
is that for the constructive part of HOL this semantics can be defined in constructive set 
theory IZF. 

An obvious approach to creating a constructive version of ZFC set theory is to replace
the underlying first-order logic with intuitionistic first-order logic. As many authors have
explained [Myh73, Bee85, McC86, Š85], the ZF axioms need to be reformulated so that they
do not imply the law of excluded middle.

In a nutshell, to get IZF from ZFC, the Axiom of Choice and Excluded Middle are
taken away and Foundation is reformulated as $\in$-induction. The axioms of IZF are thus
Extensionality, Union, Infinity, Power Set, Separation, Replacement or Collection² and
$\in$-Induction. The list of axioms for the version with Replacement can be found in Figure 1.
A detailed account of the theory can be found for example in Friedman [Fri73]. Beeson's
book [Bee85] and Ščedrov's paper [Š85] contain a lot of information on metamathematical
properties of IZF and related set theories. For convenience, we assume that the first-order
logic has built-in bounded quantifiers ($\forall x \in a.\ \phi$ and $\exists x \in a.\ \phi$), defined as abbreviations
in the standard way. We also include in the signature all the set terms corresponding to
the axioms of IZF: $\mathbb{N}$, $\bigcup t$, $P(a)$ etc. For the full list, see [Moc07].
Myhill [Myh73] has proved several important properties of IZF:

• Disjunction Property (DP): If IZF $\vdash \phi \lor \psi$, then IZF $\vdash \phi$ or IZF $\vdash \psi$.

• Numerical Existence Property (NEP): If IZF $\vdash \exists x \in \mathbb{N}.\ \phi(x)$, then there is a natural
number $n$ such that IZF $\vdash \phi(\overline{n})$, where $\overline{n} = S(S(\ldots(0)))$ and $S(x) = x \cup \{x\}$.

• Term Existence Property (TEP): If IZF $\vdash \exists x.\ \phi(x)$, then for some term $t$, IZF $\vdash \phi(t)$.

Moreover, the semantics and the soundness theorem for CHOL work in IZF, as neither
Choice nor Excluded Middle is necessary to carry out these developments. Note that the
existence of $P(1)$ is crucial for the semantics.

All the properties are constructive: there is a recursive procedure extracting a natural
number, a disjunct or a term from a proof. A trivial one is to look through all the proofs
for the correct one. For example, if IZF $\vdash \phi \lor \psi$, a procedure could enumerate all theorems
of IZF looking for either $\phi$ or $\psi$; its termination would be ensured by DP. We discuss more
efficient alternatives in section 5.3.
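
The trivial search procedure can be sketched as follows. This is only an illustration under stated assumptions: `toy_theorems` is a hypothetical stand-in for an enumeration of the theorems of IZF (a real one would enumerate and check proofs), formulas are plain strings, and `find_disjunct` is our own name.

```python
from itertools import count

def toy_theorems():
    """Hypothetical stand-in for an enumeration of all theorems of IZF."""
    for n in count():
        yield f"{n} = {n}"

def find_disjunct(theorems, phi, psi):
    """Return ("inl", phi) or ("inr", psi), whichever is enumerated first.

    The loop ends only because one of phi, psi is assumed to be a theorem,
    which is what DP guarantees once their disjunction is provable.
    """
    for theorem in theorems:
        if theorem == phi:
            return ("inl", phi)
        if theorem == psi:
            return ("inr", psi)
```

If neither disjunct were a theorem, the search over an infinite enumeration would not terminate, which is why the procedure is justified only through DP.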



5. Extraction 

We will show that the semantics we have defined can serve as a basis for program 
extraction from proofs. All that is necessary for program extraction from constructive HOL 
proofs is provided by the semantics and the soundness proof. Therefore, if one wants to 
provide an extraction mechanism for the constructive part of the logic, it may be sufficient 
to carefully define set-theoretic semantics, prove the soundness theorem and the extraction 
mechanism for IZF would take care of the rest. We speculate on practical uses of this
approach in section 6.

5.1. IZF Extraction. We first describe extraction from IZF proofs. To facilitate the 
description, we will use a very simple fragment of type theory, which we call TT°. 

The types of TT° are generated by the following abstract grammar. They should not
be confused with HOL types; the context will make it clear which types we refer to.

\[ \tau ::= * \mid P_\phi \mid \mathit{nat} \mid \mathit{bool} \mid \tau \times \tau \mid \tau + \tau \mid \tau \to \tau \]

We associate with each type $\tau$ of TT° a set of its elements, which are finitistic objects.
The set of elements of $\tau$ is denoted by $El(\tau)$ and defined by structural induction on $\tau$:

• $El(*) = \{*\}$.

• $El(P_\phi)$ is the set of all IZF proofs of the formula $\phi$.

• $El(\mathit{nat}) = \mathbb{N}$, the set of natural numbers.

• $El(\mathit{bool}) = \{\mathit{true}, \mathit{false}\}$.

• $El(\tau_1 \times \tau_2) = El(\tau_1) \times El(\tau_2)$.

• $M \in El(\tau_1 + \tau_2)$ iff either $M = inl(M_1)$ and $M_1 \in El(\tau_1)$ or $M = inr(M_1)$ and
$M_1 \in El(\tau_2)$.

• $M \in El(\tau_1 \to \tau_2)$ iff $M$ is a method which given any element of $El(\tau_1)$ returns an element
of $El(\tau_2)$.

² There is a difference; in particular, the version with Collection does not satisfy the Term Existence
Property (TEP) of Section 4. A concerned reader can replace IZF with IZF$_R$ whenever TEP is used.

In the last clause, we use an abstract notion of "method". It will not be necessary
to formalize this notion, but for the interested reader, all "methods" we use are functions
provably recursive in ZF + Con(ZF), where Con(ZF) denotes consistency of ZF.

The notation $M : \tau$ stands for $M \in El(\tau)$.

We call a TT° type pure if it does not contain $*$ and $P_\phi$. There is a natural mapping
of pure types of TT° to sets. It is so similar to the meaning of the HOL types that we will
use the same notation.

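• $[\![\mathit{nat}]\!] = \mathbb{N}$.

• $[\![\mathit{bool}]\!] = 2$.

• $[\![\tau \times \sigma]\!] = [\![\tau]\!] \times [\![\sigma]\!]$.

• $[\![\tau + \sigma]\!] = [\![\tau]\!] + [\![\sigma]\!]$, the disjoint union of $[\![\tau]\!]$ and $[\![\sigma]\!]$.

• $[\![\tau \to \sigma]\!] = [\![\tau]\!] \to [\![\sigma]\!]$.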

If a set (and a corresponding IZF term) is in the codomain of the map above, we call it
type-like. If a set $A$ is type-like, then there is a unique pure type $\tau$ such that $[\![\tau]\!] = A$. We
denote this type $Type(A)$. Thus, type-like sets are those "generated" by pure TT° types
via the natural semantics. Formally, we define a recursive set TL of IZF terms such that for
any $t \in$ TL, $t$ is type-like and we can effectively find $Type(t)$. The definition of TL follows
the definition above: TL is the smallest set such that $\mathbb{N}, 2 \in$ TL and if $t, u \in$ TL, then $t \times u$,
$t + u$ and $t \to u$ are also elements of TL. Thus, the sentence "$A$ is type-like" stands for
"$A \in$ TL". Note that for any term $t \in$ TL we can find a term $t'$ such that IZF $\vdash t = t'$ and
$t' \notin$ TL: it suffices to take $t' = t \cup 0$.
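
Membership in TL and the computation of $Type$ are purely syntactic, which the following sketch makes explicit. The encoding is our own assumption: IZF terms are nested tuples with the tags `"x"`, `"+"` and `"->"`, the atoms `"N"` and `"2"` stand for $\mathbb{N}$ and 2, and `type_of_set` is a hypothetical name.

```python
def type_of_set(term):
    """Return Type(term) if the term is in TL, and None otherwise.

    Terms are "N", "2", or tuples (op, t, u) with op in {"x", "+", "->"};
    any other shape, for example ("union", t, u), lies outside TL.
    """
    if term == "N":
        return "nat"
    if term == "2":
        return "bool"
    if isinstance(term, tuple) and len(term) == 3 and term[0] in ("x", "+", "->"):
        left = type_of_set(term[1])
        right = type_of_set(term[2])
        if left is None or right is None:
            return None
        return (term[0], left, right)
    return None
```

The check inspects only the shape of the term, so a term such as `("union", "N", "0")` is rejected even though it denotes the same set as `"N"`, matching the remark about $t \cup 0$.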

Before we proceed further, let us extend TT° with a new type $Q_\tau$, where $\tau$ is any pure
type of TT°. Intuitively, $Q_\tau$ is the provable counterpart of $[\![\tau]\!]$. Formally, the members of
$El(Q_\tau)$ are pairs $(t, \mathcal{P})$ such that $\mathcal{P} \vdash_{IZF} t \in [\![\tau]\!]$ ($\mathcal{P}$ is an IZF proof of $t \in [\![\tau]\!]$). Note that
there is a natural mapping from closed HOL terms $M$ of type $\tau$ into $Q_\tau$: it is easy to
construct using Lemma 3.8 a proof $\mathcal{P}$ of the fact that $[\![M]\!]_\rho \in [\![\tau]\!]$, so the pair $([\![M]\!]_\rho, \mathcal{P}) : Q_\tau$.
In particular, any natural number $n$ can be injected into $Q_{\mathit{nat}}$. The set of pure types stays
unchanged.

We are going to tailor extraction from IZF proofs to the HOL logic. For this purpose, we 
will specify which elements of IZF proofs/formulas carry interesting computational content 
for us. We will use the type * to mark the parts of proofs we are not interested in. 

We first define a helper function $T$, which takes a pure type $\tau$ and returns another type.
Intuitively, $T(\tau)$ is the type of the extract from a statement $\exists x.\ x \in [\![\tau]\!]$. The function $T$ is
defined by induction on $\tau$:

• $T(\mathit{bool}) = \mathit{bool}$.

• $T(\mathit{nat}) = \mathit{nat}$.

• $T(\tau \times \sigma) = T(\tau) \times T(\sigma)$.

• $T(\tau + \sigma) = T(\tau) + T(\sigma)$.

• $T(\tau \to \sigma) = Q_\tau \to T(\sigma)$. The rationale for this definition is that in order to utilize
an IZF function from $[\![\tau]\!]$ to $[\![\sigma]\!]$ we need to supply an element of the set $[\![\tau]\!]$, which is an
element of $Q_\tau$.



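Furthermore, we assign to each formula $\phi$ of IZF a TT° type $\overline{\phi}$, which intuitively
describes the computational content of an IZF proof of $\phi$. We do it by induction on $\phi$:

• $\overline{a \in b} = *$.

• $\overline{a = b} = *$ (atomic formulas carry no useful computational content).

• $\overline{\phi_1 \lor \phi_2} = \overline{\phi_1} + \overline{\phi_2}$.

• $\overline{\phi_1 \land \phi_2} = \overline{\phi_1} \times \overline{\phi_2}$.

• $\overline{\phi_1 \to \phi_2} = P_{\phi_1} \to \overline{\phi_2}$.

• $\overline{\exists a \in A.\ \phi_1} = T(Type(A)) \times \overline{\phi_1}$, if $A$ is type-like.

• $\overline{\exists a \in A.\ \phi_1} = *$, if $A$ is not type-like.

• $\overline{\exists a.\ \phi_1} = *$.

• $\overline{\forall a \in A.\ \phi_1} = Q_{Type(A)} \to \overline{\phi_1}$, if $A$ is type-like.

• $\overline{\forall a \in A.\ \phi_1} = *$, if $A$ is not type-like.

• $\overline{\forall a.\ \phi_1} = *$.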

The definition is tailored for HOL logic and could be extended to allow meaningful 
extraction from a larger class of formulas. For example, we could extract a term from 
3a. 0i using Term Existence Property. 

We present several natural examples of our translation in action: 

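(1) $\overline{\exists x \in \mathbb{N}.\ x = x} = \mathit{nat} \times *$.

(2) $\overline{\forall x \in \mathbb{N}.\ \exists y \in \mathbb{N}.\ \phi} = Q_{\mathit{nat}} \to \mathit{nat} \times \overline{\phi}$.

(3) $\overline{\forall f \in \mathbb{N} \to \mathbb{N}.\ \exists x \in \mathbb{N}.\ f(x) = 0} = Q_{\mathit{nat} \to \mathit{nat}} \to \mathit{nat} \times *$.

These types are richer than what we intuitively would expect (nat in the first case,
nat → nat in the second and (nat → nat) → nat in the third), because any closed HOL
term of type nat or nat → nat can be injected into $Q_{\mathit{nat}}$ or $Q_{\mathit{nat} \to \mathit{nat}}$ via the soundness
theorem. The extra $*$ can be easily discarded from types (and extracts).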
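
The helper $T$ and the assignment of types to formulas can be sketched as two recursive functions. The encoding is an assumption made only for illustration: pure types are `"nat"`, `"bool"` or tuples `(op, a, b)`, the types $Q_\tau$ and $P_\phi$ are the tagged tuples `("Q", tau)` and `("P", phi)`, a bounded quantifier carries the pure type of its bound (or `None` when the bound is not type-like), and the names `T` and `extract_type` are hypothetical.

```python
STAR = "*"

def T(tau):
    """Type of the extract from a statement 'there is x in [[tau]]'."""
    if tau in ("nat", "bool"):
        return tau
    op, a, b = tau
    if op in ("x", "+"):
        return (op, T(a), T(b))
    if op == "->":
        return ("->", ("Q", a), T(b))   # T(tau -> sigma) = Q_tau -> T(sigma)
    raise ValueError(tau)

def extract_type(phi):
    """Type describing the computational content of a proof of phi."""
    tag = phi[0]
    if tag == "atom":                   # a in b, a = b
        return STAR
    if tag == "or":
        return ("+", extract_type(phi[1]), extract_type(phi[2]))
    if tag == "and":
        return ("x", extract_type(phi[1]), extract_type(phi[2]))
    if tag == "imp":
        return ("->", ("P", phi[1]), extract_type(phi[2]))
    if tag == "exists_in":              # ("exists_in", pure type or None, body)
        if phi[1] is None:
            return STAR
        return ("x", T(phi[1]), extract_type(phi[2]))
    if tag == "forall_in":              # ("forall_in", pure type or None, body)
        if phi[1] is None:
            return STAR
        return ("->", ("Q", phi[1]), extract_type(phi[2]))
    if tag in ("exists", "forall"):     # unbounded quantifiers
        return STAR
    raise ValueError(tag)
```

With an atomic body, `extract_type(("exists_in", "nat", ("atom",)))` gives the encoding of nat × *, matching example (1), and wrapping it in `("forall_in", ("->", "nat", "nat"), ...)` gives the encoding of the type in example (3).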

Lemma 5.1. For any IZF term $t$ which is not type-like, $\overline{\phi[a := t]} = \overline{\phi}$.

Proof. Straightforward induction on $\phi$. □

Lemma 5.2 (IZF). $(\exists a \in 2.\ \phi(a))$ iff $\phi(0) \lor \phi(1)$.

We are now ready to describe the extraction function $E$, which takes an IZF proof $\mathcal{P}$
of a formula $\phi$ and returns an object of TT° type $\overline{\phi}$. We do it by induction on $\phi$, checking
on the way that the object returned is of type $\overline{\phi}$. Recall that DP, TEP and NEP denote
Disjunction, Term and Numerical Existence Property, respectively. Case $\phi$ of:

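• $a \in b$: return $*$. We have $* : *$.

• $a = b$: return $*$. We have $* : *$, too.

• $\phi_1 \lor \phi_2$. Apply DP to $\mathcal{P}$ to get a proof $\mathcal{P}_1$ of either $\phi_1$ or $\phi_2$. In the former case return
$inl(E(\mathcal{P}_1))$, in the latter return $inr(E(\mathcal{P}_1))$. By the inductive hypothesis, $E(\mathcal{P}_1) : \overline{\phi_1}$ (or
$E(\mathcal{P}_1) : \overline{\phi_2}$), so $E(\mathcal{P}) : \overline{\phi}$ follows.

• $\phi_1 \land \phi_2$. Then there are proofs $\mathcal{P}_1$ and $\mathcal{P}_2$ such that $\mathcal{P}_1 \vdash \phi_1$ and $\mathcal{P}_2 \vdash \phi_2$. Return
a pair $(E(\mathcal{P}_1), E(\mathcal{P}_2))$. By the inductive hypothesis, $E(\mathcal{P}_1) : \overline{\phi_1}$ and $E(\mathcal{P}_2) : \overline{\phi_2}$, so
$(E(\mathcal{P}_1), E(\mathcal{P}_2)) : \overline{\phi_1 \land \phi_2}$.

• $\phi_1 \to \phi_2$. Return a function $G$ which takes an IZF proof $\mathcal{Q}$ of $\phi_1$, applies $\mathcal{P}$ to $\mathcal{Q}$ (using
the modus-ponens rule of the first-order logic) to get a proof $\mathcal{R}$ of $\phi_2$ and returns $E(\mathcal{R})$.
By the inductive hypothesis, any such $E(\mathcal{R})$ is in $El(\overline{\phi_2})$, so $G : P_{\phi_1} \to \overline{\phi_2}$.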

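• $\exists a \in A.\ \phi_1(a)$, where $A$ is type-like. Let $T = Type(A)$. We proceed by induction on $T$.
Case $T$ of:

– bool. By Lemma 5.2 we have $\phi_1(0) \lor \phi_1(1)$. Apply DP to get a proof $\mathcal{Q}$ of either
$\phi_1(0)$ or $\phi_1(1)$. Let $b$ be false or true, respectively. Return a pair $(b, E(\mathcal{Q}))$. By
the inductive hypothesis, $E(\mathcal{Q}) : \overline{\phi_1([\![b]\!])}$. By Lemma 5.1, since $[\![b]\!]_\rho$ is not type-like,
$E(\mathcal{Q}) : \overline{\phi_1}$, so $(b, E(\mathcal{Q})) : T(\mathrm{bool}) \times \overline{\phi_1} = \overline{\exists a \in 2.\ \phi_1(a)}$.

– nat. Apply NEP to $\mathcal{P}$ to get a natural number $n$ and a proof $\mathcal{Q}$ of $\phi_1(\overline{n})$. Return a
pair $(n, E(\mathcal{Q}))$. By the inductive hypothesis, $E(\mathcal{Q}) : \overline{\phi_1(\overline{n})}$. By Lemma 5.1, since we
can assume without loss of generality that $\overline{n}$ is not type-like, $E(\mathcal{Q}) : \overline{\phi_1}$, so $(n, E(\mathcal{Q})) :
T(\mathrm{nat}) \times \overline{\phi_1}$.

– $(\tau, \sigma)$. Construct a proof $\mathcal{Q}$ of $\exists a_1 \in [\![\tau]\!]\ \exists a_2 \in [\![\sigma]\!].\ a = (a_1, a_2) \land \phi_1$. Let $M =
E(\mathcal{Q})$. By the inductive hypothesis $M$ is a pair $(M_1, M_2)$ such that $M_1 : T(\tau)$ and
$M_2 : \overline{\exists a_2 \in [\![\sigma]\!].\ a = (a_1, a_2) \land \phi_1}$. Therefore $M_2$ is a pair $(M_{21}, M_{22})$, $M_{21} : T(\sigma)$ and
$M_{22} : \overline{a = (a_1, a_2) \land \phi_1}$. Therefore $M_{22}$ is a pair $(N, O)$, where $O : \overline{\phi_1}$. Therefore
$(M_1, M_{21}) : T(\tau \times \sigma)$, so $((M_1, M_{21}), O) : T(\tau \times \sigma) \times \overline{\phi_1}$ and we are justified to return
$((M_1, M_{21}), O)$.

– $\tau + \sigma$. Construct a proof $\mathcal{Q}$ of $(\exists a \in [\![\tau]\!].\ \phi_1) \lor (\exists a \in [\![\sigma]\!].\ \phi_1)$. Apply DP to get the proof
$\mathcal{Q}_1$ of (without loss of generality) $\exists a \in [\![\tau]\!].\ \phi_1$. Let $M = E(\mathcal{Q}_1)$. By the inductive
hypothesis, $M = (M_1, M_2)$, where $M_1 : T(\tau)$ and $M_2 : \overline{\phi_1}$. Return $(inl(M_1), M_2)$,
which is of type $T(\tau + \sigma) \times \overline{\phi_1}$.

– $\tau \to \sigma$. Use TEP to get a term $f$ such that $(f \in [\![\tau]\!] \to [\![\sigma]\!]) \land \phi_1(f)$. Construct
proofs $\mathcal{Q}_1$ of $\forall x \in [\![\tau]\!]\ \exists y \in [\![\sigma]\!].\ f(x) = y$ and $\mathcal{Q}_2$ of $\phi_1(f)$. Without loss of generality,
we can assume that $f$ is not type-like. By the inductive hypothesis and Lemma 5.1,
$E(\mathcal{Q}_2) : \overline{\phi_1}$. Let $G$ be a function which works as follows: $G$ takes a pair $(t, \mathcal{R})$ such
that $\mathcal{R} \vdash t \in [\![\tau]\!]$, applies $\mathcal{Q}_1$ to $t, \mathcal{R}$ to get a proof $\mathcal{R}_1$ of $\exists y \in [\![\sigma]\!].\ f(t) = y$ and
calls $E(\mathcal{R}_1)$ to get a term $M$. By the inductive hypothesis, $M : \overline{\exists y \in [\![\sigma]\!].\ f(t) = y}$,
so $M = (M_1, M_2)$, where $M_1 : T(\sigma)$. The function $G$ returns $M_1$. Our extraction
procedure $E(\mathcal{P})$ returns $(G, E(\mathcal{Q}_2))$. The type of $(G, E(\mathcal{Q}_2))$ is $(Q_\tau \to T(\sigma)) \times \overline{\phi_1}$,
which is exactly $T(\tau \to \sigma) \times \overline{\phi_1}$.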

• $\exists a \in A.\ \phi_1(a)$, where $A$ is not type-like. Return $*$.

• $\exists a.\ \phi_1(a)$. Return $*$.

• $\forall a \in A.\ \phi_1(a)$, where $A$ is type-like. Return a function $G$ which takes an element $(t, \mathcal{Q})$
of $Q_{Type(A)}$, applies $\mathcal{P}$ to $t$ and $\mathcal{Q}$ to get a proof $\mathcal{R}$ of $\phi_1(t)$, and returns $E(\mathcal{R})$. Without
loss of generality, we can assume that $t$ is not type-like. By the inductive hypothesis and
Lemma 5.1, $E(\mathcal{R}) : \overline{\phi_1}$, so $G : Q_{Type(A)} \to \overline{\phi_1} = \overline{\forall a \in A.\ \phi_1(a)}$.

• $\forall a \in A.\ \phi_1(a)$, where $A$ is not type-like. Return $*$.

• $\forall a.\ \phi_1(a)$. Return $*$.
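The case analysis above has the shape of a recursive function parameterized by implementations of DP and NEP. The Python sketch below covers only a fragment (atoms, connectives, existentials over bool and nat, and universals) and rests on several assumptions that are not part of the development: formulas are hypothetical tuples, the `CanonicalOracle` class is a stub in which a proof is already given as canonical evidence, Python integers stand in for IZF terms, and the body of a quantifier is passed on without substituting the witness, which Lemma 5.1 suggests is harmless for the type of the result.

```python
# Formulas: ("atom", text), ("or", a, b), ("and", a, b), ("imp", a, b),
# ("ex", kind, body), ("all", kind, body), where kind is "nat", "bool",
# or None when the bound is missing or not type-like.

class CanonicalOracle:
    """Stub for DP / NEP: proofs are already canonical evidence (hypothetical)."""
    def dp(self, proof):             # disjunction: ("inl", p) or ("inr", p)
        return proof
    def split(self, proof):          # conjunction: (p1, p2)
        return proof
    def nep(self, proof):            # existential over N: (n, p)
        return proof
    def apply(self, proof, *args):   # modus ponens / universal instantiation
        return proof(*args)

def extract(proof, phi, oracle):
    tag = phi[0]
    if tag == "atom":
        return "*"
    if tag == "or":
        side, sub = oracle.dp(proof)
        chosen = phi[1] if side == "inl" else phi[2]
        return (side, extract(sub, chosen, oracle))
    if tag == "and":
        p1, p2 = oracle.split(proof)
        return (extract(p1, phi[1], oracle), extract(p2, phi[2], oracle))
    if tag == "imp":
        # G takes a proof q of phi_1 and extracts from the resulting proof of phi_2.
        return lambda q: extract(oracle.apply(proof, q), phi[2], oracle)
    if tag in ("ex", "all"):
        _, kind, body = phi
        if kind is None:
            return "*"
        if tag == "all":
            # G takes a pair (t, q) with q a proof that t belongs to the bound.
            return lambda tq: extract(oracle.apply(proof, *tq), body, oracle)
        if kind == "nat":
            n, sub = oracle.nep(proof)
            return (n, extract(sub, body, oracle))
        if kind == "bool":
            # After Lemma 5.2 the proof is one of phi(0) or phi(1).
            side, sub = oracle.dp(proof)
            return (side == "inr", extract(sub, body, oracle))
        raise NotImplementedError("product, sum and function witnesses need TEP")
    raise ValueError(f"unknown formula: {phi!r}")

atom = ("atom", "x < y")
phi = ("all", "nat", ("ex", "nat", atom))
proof = lambda t, q: (t + 1, "axiom")          # canonical evidence for phi
g = extract(proof, phi, CanonicalOracle())
print(g((3, "proof that 3 is in N")))          # (4, '*')
```

A realistic oracle would replace the stub methods by calls into an implementation of DP and NEP for IZF; the recursion itself would stay the same.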

5.2. HOL extraction. As in the case of IZF, we will show how to do extraction from a subclass
of CHOL proofs. The choice of the subclass is largely arbitrary; ours illustrates the
method and can be easily extended.

We say that a CHOL formula is extractable if it is generated by the following abstract
grammar, where $\tau$ varies over pure $TT^0$ types and $\oplus \in \{\land, \lor, \to\}$.

$\phi ::= \forall x : \tau.\ \phi \mid \exists x : \tau.\ \phi \mid \phi \oplus \phi \mid \bot \mid t = t$
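Membership in this grammar is easy to check mechanically. A minimal sketch, assuming a hypothetical tuple encoding of CHOL formulas and of types (none of these names come from the development itself):

```python
# Formulas: ("forall", x, tau, body), ("exists", x, tau, body),
# ("and", a, b), ("or", a, b), ("imp", a, b), ("bot",), ("eq", t, s).
# Any other tag (for example a predicate application) is not extractable.

def is_pure(ty):
    """A pure type is built from nat and bool using pairs, sums and functions."""
    if ty in ("nat", "bool"):
        return True
    if isinstance(ty, tuple) and len(ty) == 3 and ty[0] in ("pair", "sum", "fun"):
        return is_pure(ty[1]) and is_pure(ty[2])
    return False

def is_extractable(phi):
    tag = phi[0]
    if tag in ("forall", "exists"):
        _, _var, tau, body = phi
        return is_pure(tau) and is_extractable(body)
    if tag in ("and", "or", "imp"):
        return is_extractable(phi[1]) and is_extractable(phi[2])
    return tag in ("bot", "eq")

total = ("forall", "x", "nat", ("exists", "y", "nat", ("eq", "x", "y")))
print(is_extractable(total))   # True
```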

We will define extraction for CHOL proofs of extractable formulas. By Theorem 3.1,
if CHOL $\vdash \phi$, then IZF $\vdash \emptyset \in [\![\phi]\!]_\rho$. We need to slightly transform this IZF proof in
order to come up with a valid input to $E$ from the previous section. To this end,
for any extractable $\phi(a_1, \ldots, a_n)$ we define an IZF formula $\phi'(b_1, \ldots, b_n)$ such that
IZF $\vdash \emptyset \in [\![\phi(a_1, \ldots, a_n)]\!]_{\rho[a_1 := b_1, \ldots, a_n := b_n]} \leftrightarrow \phi'$. The formula $\phi'$ is essentially $\phi$ with type
membership information replaced by set membership information. We define $\phi'$ by induction
on $\phi$, checking the correctness on the way. We work in IZF. Let $\rho' = \rho[a_1 := b_1, \ldots, a_n := b_n]$.
Thus we want to show IZF $\vdash \emptyset \in [\![\phi]\!]_{\rho'} \leftrightarrow \phi'$. Case $\phi$ of:

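• $\bot$. Take $\phi' = \emptyset \in [\![\bot]\!]_{\rho'}$. The correctness is trivial.

• $t = s$. Take $\phi' = \emptyset \in [\![t = s]\!]_{\rho'}$. The correctness is trivial.

• $\phi_1 \lor \phi_2$. By the inductive hypothesis we get $\phi_1'$ and $\phi_2'$ such that $\emptyset \in [\![\phi_1]\!]_{\rho'} \leftrightarrow \phi_1'$ and
$\emptyset \in [\![\phi_2]\!]_{\rho'} \leftrightarrow \phi_2'$. Take $\phi' = \phi_1' \lor \phi_2'$. We have $\emptyset \in [\![\phi_1 \lor \phi_2]\!]_{\rho'}$ iff $\emptyset \in [\![\phi_1]\!]_{\rho'}$ or $\emptyset \in [\![\phi_2]\!]_{\rho'}$
iff $\phi_1' \lor \phi_2'$, which shows the claim.

• $\phi_1 \land \phi_2$. By the inductive hypothesis we get $\phi_1'$ and $\phi_2'$ such that $\emptyset \in [\![\phi_1]\!]_{\rho'} \leftrightarrow \phi_1'$ and
$\emptyset \in [\![\phi_2]\!]_{\rho'} \leftrightarrow \phi_2'$. Set $\phi' = \phi_1' \land \phi_2'$. The correctness follows easily.

• $\phi_1 \to \phi_2$. By the inductive hypothesis we get $\phi_1'$ such that $\emptyset \in [\![\phi_1]\!]_{\rho'} \leftrightarrow \phi_1'$ and $\phi_2'$ such
that $\emptyset \in [\![\phi_2]\!]_{\rho'} \leftrightarrow \phi_2'$. Set $\phi' = \phi_1' \to \phi_2'$. The correctness follows easily.

• $\forall a : \tau.\ \phi_1(a, a_1, \ldots, a_n)$. By the inductive hypothesis we get $\phi_1'(b, b_1, \ldots, b_n)$ such that
$\forall b, b_1, \ldots, b_n.\ \emptyset \in [\![\phi_1]\!]_{\rho'[a := b]} \leftrightarrow \phi_1'$. Set $\phi' = \forall a \in [\![\tau]\!].\ \phi_1'(a, b_1, \ldots, b_n)$. For the
correctness, we have $\emptyset \in [\![\forall a : \tau.\ \phi_1(a, a_1, \ldots, a_n)]\!]_{\rho'}$ iff $\forall A \in [\![\tau]\!].\ \emptyset \in [\![\phi_1]\!]_{\rho'[a := A]}$. By the
inductive hypothesis, this is equivalent to $\forall A \in [\![\tau]\!].\ \phi_1'(A, b_1, \ldots, b_n)$, which is precisely
$\phi'$.

• $\exists a : \tau.\ \phi_1$. By the inductive hypothesis we get $\phi_1'(b, b_1, \ldots, b_n)$ such that

$\forall b, b_1, \ldots, b_n.\ \emptyset \in [\![\phi_1]\!]_{\rho'[a := b]} \leftrightarrow \phi_1'$.

Set $\phi' = \exists a \in [\![\tau]\!].\ \phi_1'(a, b_1, \ldots, b_n)$. The correctness follows as in the previous case.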
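The definition of $\phi'$ by cases can be sketched as a recursive translation. The encoding below is an assumption made for illustration only: CHOL formulas are hypothetical tuples, `("holds", phi)` stands for the IZF formula $\emptyset \in [\![\phi]\!]_{\rho'}$, and `("sem", tau)` stands for the set $[\![\tau]\!]$.

```python
# CHOL formulas: ("forall", x, tau, body), ("exists", x, tau, body),
# ("and", a, b), ("or", a, b), ("imp", a, b), ("bot",), ("eq", t, s).

def relativize(phi):
    """Compute phi' from an extractable formula phi."""
    tag = phi[0]
    if tag in ("bot", "eq"):
        # Base cases keep the semantic statement itself.
        return ("holds", phi)
    if tag in ("and", "or", "imp"):
        # Connectives are translated componentwise.
        return (tag, relativize(phi[1]), relativize(phi[2]))
    if tag in ("forall", "exists"):
        # Typed quantifiers become quantifiers bounded by the set [[tau]].
        _, var, tau, body = phi
        return (tag + "_in", var, ("sem", tau), relativize(body))
    raise ValueError(f"not an extractable formula: {phi!r}")

phi = ("forall", "x", "nat", ("exists", "y", "nat", ("eq", "x", "y")))
print(relativize(phi))
```

The result has bounded quantifiers over type-like sets at every quantifier position, which is the form of input the extraction function $E$ handles non-trivially.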
Now we can finally define the extraction process. Suppose CHOL $\vdash \phi$, where $\phi$ is
closed and extractable. Let $\rho$ be the empty environment. Using the soundness theorem,
construct an IZF proof $\mathcal{P}$ that $\emptyset \in [\![\phi]\!]_\rho$. Use the definition above to get $\phi'$ such that IZF
$\vdash \emptyset \in [\![\phi]\!]_\rho \leftrightarrow \phi'$ and using $\mathcal{P}$ obtain a proof $\mathcal{R}$ of $\phi'$. Finally, apply the extraction function
$E$ to $\mathcal{R}$ to get the computational extract.

5.3. Implementation issues. The extraction process is parameterized by the implementation
of NEP, DP and TEP for IZF. Obviously, searching through all IZF proofs to get a
witnessing natural number, term or a disjunct would not be a very effective method. We
discuss two alternative approaches.

The first approach is based on realizability. Rathjen defines a realizability relation
in [Rat05] for a weaker, predicative constructive set theory, CZF. For any CZF proof of a
formula $\phi$, there is a realizer $e$ such that the realizability relation $e \Vdash \phi$ holds; moreover,
this realizer can be found constructively from the proof. Realizers provide the information
needed for DP and NEP: which of the disjuncts holds and the witnessing number. They could
be implemented using lambda terms. These results have also recently been extended to IZF
[Rat06]. The approach has the drawback of not providing a proof of TEP, which would
restrict the extraction process from statements of the form $\exists x \in [\![\tau]\!].\ \phi$ to atomic types $\tau$.
Moreover, the gap between the existing theoretical result and a possible implementation is
quite wide.

The second, more direct approach is based on Moczydlowski's proof in [Moc06a] of weak
normalization of the lambda calculus $\lambda Z$ corresponding to proofs in IZF. The normalization
is used to prove NEP, DP and TEP for the theory, and the necessary information is extracted
from the normal form of the lambda term corresponding to the IZF proof. Thus in order
to provide the implementation of DP, NEP and TEP for IZF, it would suffice to implement
$\lambda Z$, which is specified completely in [Moc06a, Moc06b].
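To illustrate how a disjunct or a numerical witness can be read off a normal form, here is a toy sketch. It is emphatically not $\lambda Z$: the term language, the call-by-value evaluator and the names `evaluate`, `dp` and `nep` are hypothetical, and the sketch assumes that a closed, fully evaluated proof of a disjunction or of an existential ends with the corresponding introduction form.

```python
def evaluate(term, env=None):
    """Call-by-value evaluation of a closed toy proof term to a canonical value."""
    env = {} if env is None else env
    tag = term[0]
    if tag == "var":
        return env[term[1]]
    if tag == "lam":                       # ("lam", x, body)
        return ("closure", term[1], term[2], env)
    if tag == "app":                       # ("app", f, a)
        _, x, body, closure_env = evaluate(term[1], env)
        arg = evaluate(term[2], env)
        return evaluate(body, {**closure_env, x: arg})
    if tag in ("inl", "inr"):              # disjunction introduction
        return (tag, evaluate(term[1], env))
    if tag == "case":                      # ("case", scrutinee, x, left, y, right)
        _, scrutinee, x, left, y, right = term
        side, value = evaluate(scrutinee, env)
        if side == "inl":
            return evaluate(left, {**env, x: value})
        return evaluate(right, {**env, y: value})
    if tag == "pair":                      # existential introduction: (witness, proof)
        return ("pair", evaluate(term[1], env), evaluate(term[2], env))
    if tag in ("num", "ax"):               # numerals and atomic proofs are values
        return term
    raise ValueError(f"unknown term: {term!r}")

def dp(term):
    """Which disjunct does the closed proof term establish?"""
    side, _ = evaluate(term)
    return side

def nep(term):
    """The numerical witness carried by a closed proof of an existential."""
    _, witness, _ = evaluate(term)
    return witness[1]

# A proof of "B or A" obtained by applying a swap lemma to a proof of "A or B".
swap = ("lam", "p", ("case", ("var", "p"),
                     "x", ("inr", ("var", "x")),
                     "y", ("inl", ("var", "y"))))
print(dp(("app", swap, ("inl", ("ax",)))))   # inr
```

The point of the example is that the answer is not visible in the original term, which ends with an application; it only becomes visible after evaluation.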

An alternative approach has been presented by Berghofer [Ber04]. He defines extraction
for a constructive variant of HOL directly in the generic theorem prover Isabelle and
uses realizability to justify its correctness. His approach could likely be tailored to our
CHOL, so that it would yield extracts equivalent to ours. An exciting project would be to
formalize IZF and both methods of extraction in Isabelle and show their equivalence and
correctness.



6. Conclusion 

We have presented a computational semantics for HOL via standard interpretation in 
intuitionistic set theory. The semantics is clean, simple and agrees with the standard one. 

The advantage of this approach is that the extraction mechanism is completely external
to Constructive HOL. Using only the semantics, we can take any constructive HOL proof
and extract computational information from it. No enrichment of the logic with normalizing
proof terms is necessary.

The separation of the extraction mechanism from the logic makes the logic very easily
extendable. For example, inductive datatypes and subtyping have clean set-theoretic
semantics, so they can easily be added to HOL while preserving consistency, as witnessed in PVS. As
the semantics would work constructively, the extraction mechanisms from Section 5 could
be easily adapted to incorporate them. Similarly, one could define a set-theoretic semantics
for the constructive version of HOL implemented in Isabelle ([Ber04, BN02]) in the same
spirit, with the same advantages.

The modularity of our approach, and the fact that it is much easier to give a set-theoretic
semantics for the logic than to prove normalization, could make the development of new
trustworthy provers with extraction capabilities much easier and faster.

We would like to thank anonymous reviewers for their helpful comments. 